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Abstract 

Populations of heterogeneous cells play an important role in many bi- 
ological systems. In this paper we consider systems where each cell can 
be modelled by an ordinary differential equation. To account for hetero- 
geneity, parameter values are different among individual cells, subject to 
a distribution function which is part of the model specification. 

Experimental data for heterogeneous cell populations can be obtained 
from flow cytometric fluorescence microscopy. We present a heuristic ap- 
proach to use such data for estimation of the parameter distribution in the 
population. The approach is based on generating simulation data for sam- 
ples in parameter space. By convex optimisation, a suitable probability 
density function for these samples is computed. 

To evaluate the proposed approach, we consider artificial data from 
a simple model of the tumor necrosis factor (TNF) signalling pathway. 
Its main characteristic is a bimodality in the TNF response: a certain 
percentage of cells undergoes apoptosis upon stimulation, while the re- 
maining part stays alive. We show how our modelling approach allows to 
identify the reasons that underly the differential response. 

Keywords: estimation of probability distribution, population models, 
TNF signalling 

1 Introduction 

Modelling in systems biology typically aims at achieving a quantitative descrip- 
tion of intracellular signal transduction or differentiation processes at the cellular 
level. Most models describe a "typical single cell" on the basis of experimen- 
tal data obtained from cell populations. However, to understand dynamical 
behaviour within heterogeneous cell populations, a consideration of many cells 
within the whole population is mandatory. 

Phcnotypic heterogeneity in genetically identical cells arises mainly from 
stochasticity in biochemical reac tions, unequal partitionin g of cellular material 
at cell division ( Mantzarisl . 2007 ). or epigenetic differences ( Avery , 2006 ). When 



considering cells with high mutation rate, such as cancer cells, also genotypic 



f 



heterogeneity plays a major role. For this paper, we choose to model hetero- 
geneity by differences in parameter values of the model describing the process 
of interest. The model structure is on the contrary assumed to be identical 
in all cells, as it usually represents the physical interactions among molecules, 
which should be independent of the cell's state. The parametric approach is 
well suited for genetic and epigenetic differences. We assume that interactions 
among cells in the population can be neglected for the process to be studied. 
This is indeed the case for many relevant signalling pathways, and is also an 
implicit assumption in many single-cell models. The distribution of parameter 
values within the considered cell population is described by a suitable multi- 
variate probability distribution function, which needs to be part of the model 
specification. Mathematical modelling of such a process will typically result in 
a non-linear pa rtial differential e quation for the probability distributions of the 
state variables ( Mantzarisl 2007b . Since this is very hard to deal with, we pro- 
pose to use a sample based approach, consisting of a large collection of ordinary 
differential equation systems of identical structure, but with differing parameter 
values which are subject to a specified parameter distribution function. 

In this paper, we explore the possibility to estimate the parameter distribu- 
tion function using experimental large-scale measurements of the distributions 
of system variables within the cell population. Such data is available on a suit- 
able scale from a newly developed measurem ent technology, the flow cytometric 



fluorescence microscopy (jOrtyn et all 120071 ). which is a combination of classical 



flow cytometry and fluorescence microscopy. 

Classical flow cytometry is a long-established too l to obtain distribution s 
of system variables in heterogeneous cell populations ( Perez and Nolan . 20061 ). 
To measure the activity of signalling proteins, suitable fluorescence markers are 
introduced into the cells. A stream of several thousand cells per second is then 
injected into the measurement device, and the fluorescence intensity of each 
individual cell can be measured. While static flow cytometry measurements are 
very common in experimental setups, co rresponding time cou rse data are rarely 



collected and are typically quite sparse (jGardner et al. , 2000) 



Fluorescence microscopy is another established experimental tool, where mi- 
croscopic images from a population of fluorescently labelled cells are collected 
and evaluated by image analysis. With the classical technical implementation, 
fluorescence microscopy is limited to small sample numbers. Yet percentages 
of cells showing a particular feature, such as an apoptotic phenotype, are com- 
monly measured at several time instances. Als o distributions of rele vant vari- 
ables over a time course have been measured ( Mettetal et all 120061) . but the 
technology is not widely used in dynamical modelling studies. 

Flow cytometric fluorescence microscopy now combines flow cytometry with 
single cell fluorescence microscopy by taking microscopic images of individual 
cells while they pass the flow cytometer. This allows to collect and analyse 
microscopic images of several thousand fluorescently labelled cells per minute 



(jOrtvn et all [2007), with technological requirements similar to classical flow 



cytometry. In this way, distributions of signalling protein activities can be mea - 



sured efficiently in large populations of heterogeneous cells (IGeorge et all 2006). 
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Although the technology has not been used so far to obtain distributions of rel- 
evant variables at several time instances, such measurements are now becoming 
experimentally feasible. 

Estimation of parameter distributions in model collections that represent a 



heter ogeneous population is a long-standing topic in pharmacodynamics (jAl-Banna et al 



199df ). However, a crucial difference between pharmacological experiments and 



cell population measurements is that in pharmacodynamics, samples are taken 
from the same individuals at all time points, measurements are linked to indi- 
viduals, and as a consequence individual trajectories are known. This is not 
the case in fluorescence microscopy, where each individual cell is measured only 
once, and for each time point only the distribution of the measured variable 
within the population is recorded. 

Other established approaches to parameter estimation of probabilistic sys- 
tems usually consider a problem setup where the output of a single cell is directly 
considered as available measurement data at all time instances. This is quite 
different to our setup, where each individual cell can only be taken for mea- 
surement once, and thus only the distribution of output variables within the 
population is reliably known for all sampling times. As a consequence, estab- 
lished approaches of parameter estimation seem not to be well suited to deal 
with this problem. 

In this paper, we present a heuristic approach to estimate the parameter 
distribution from the distributions of measured variables. In a first step, simu- 
lation data is generated for a suitable choice of parameter samples. A s such, the 



approach is related to classical particle filters (jDoucet et all l200ll ). However, 
instead of an iterative updating, we construct a convex optimisation problem 
that produces a suitable weighting for the considered parameter samples. This 
weighting can directly be transformed into a probability distribution for the 
parameter values. 

The paper is structured as follows. In Section [2l the population modelling 
framework that we are using is introduced. In Section^ we present the proposed 
method to estimate the parameter distribution function of the model based on 
population measurements. Section U describes the application of the proposed 
method to simulated data for a model of the TNF signalling pathway, and 
discusses how to use population modelling in order to evaluate differences in 
cellular behavior within a heterogeneous cell population. 

Notation: Denote by [z, z] C K fe the hyperrectangle {z € M. k : Zi < z% < 
Zi,i = l,.. .,k}. 

2 Parameter-distributed population models 

For the purpose of this paper, a model of a biochemical reaction network in a 
population of N cells is given by the collection of differential equations 

7r«), xW(0)=4 l) , 
y^(t)=h(xW(t)), i = l,...,N 
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with state variables x(t) £ R", measured variables y(t) £ M 9 , and parameters 
7r € f. The index i specifies the individual cells within the population. We 
collect the parameters and initial condition in the extended parameter vector 
pW = (ttW, atg ) S l m , where m — n + r. 

We assume that the population is heterogeneous, where heterogeneity is 
accounted for by differences in parameter values among individual cells. The 
distribution of parameters and initial conditions is given by a cumulative prob- 
ability distribution function $ : R m — > [0, 1] which is part of the model specifi- 
cation, i.e. parameter values and initial conditions for the cell with index i are 
subject to the probability distribution 

Prob(pW < p u .. . ,pW < Pm ) =$(p 1 ,... ,p m ). (2) 

Due to the measurement technology, the output of every individual cell can 
only be measured once during the course of an experiment, because the cell is 
removed from the population for the measurement. Thus, instead of consid- 
ering the measured output y^ directly, it makes more sense to consider the 
distribution of y^ at sampling times tk, k = 1, . . . , K as an output. At each 
sampling time, M cells are selected arbitrarily from the population and sub- 
jected to measurement. We assume that M is large enough such that a reliable 
approximation of the output distribution within the whole population can be 
obtained. The measurement taken from |lj is thus given by functions 

* fc (2/)=Prob(y«(* fc )<y), k=l,...,K, i=l,...,N. (3) 

The goal of parameter distribution estimation is to compute the function . . . ,p m ) 
from knowledge of the functions ^k(y) and the model structure ([T]). Typically 
the measurement of ^k(y) is discretized over suitable hyperrectangles in the 
variable y. Since the number of cells being measured is finite, the values of 
^k(y) are discrete as well, although for most considerations we can assume the 
number of cells large enough to neglect this. 



3 Parameter estimation method for population 
models 

3.1 Maximum Entropy approach to probability density es- 
timation 

The proposed estimation method is based on the simulation of {1} for all pa- 
rameter values and initial conditions contained in a finite sample 

V = {p {l) eR™ :i = l,...,M}. 

A good choice for a sampling set is the so called Latin hypercu be, wh i ch ens ures 
that the total range of the relevant parameter set is captured (jSteinl . Il987l ) . 
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Definition 1 A finite set V C [p,p] C M. m is called a latin hypercube in M m 
with sampling density d, if it contains exactly one point p G K m such that 

(a -l) — 2 — <Pi-Pi<a — 2 — W 

for each i = 1, . . . , m and a = 1, . . . , d. 

Having chosen the sampling set V as a Latin hypercube, the goal is to 
estimate the fraction of cells, tp(p^), which have an extended parameter vector 
close to p^ l \ such that the weighted simulated trajectories approximate the 
measured population dynamics reasonably well. This fraction is a measure for 
the relative contribution of the neighbourhood of p"> to the cell population 
response. Additionally, the fractions have to sum up to one, 

M 



_>(P W ) = 1. (5) 

i=l 

Hence, (p(p( 1 ') can be interpreted as an approximation of the probability density 
function at pW . 

In order to calculate <p(p^)> the output space is divided into hyperrectangles. 
For each sampling time, the fraction of cells of the population which is contained 
in each hyperrectangle is computed according to the following definition. An 
illustration is shown in Figure [T] 

Definition 2 A q- dimensional array Y(tk) 6 M^ 1 x ■ ■ ■ x P q is called a discretized 
population distribution at time tk with discretization vector (3 — [Pi, ... , [3 q \ , if 

Y hu ... nq) (t k ) = Prob(y«(f fe ) g [y\y J ]), (6) 

where 71 = 1, . . . , fa, . . . , 7 9 = 1, . . . , (3 qi and yl = y l + (7, - 1) ' ' and 

Pi 

yl = Vi+ Ji— 77^—7 i = 1, ... ,11%. Hereby x/i and yi are the minimal respectively 
Pi 

maximal values of output i which are measured. 



Y< 



(71) 



i. |.. h> " i< i» »i > y 



(a) 



■>7i 



1 2 



3 4 
(b) 



Figure 1: Illustration of the discretized population distribution, (a) shows as 
dots the measured outputs at time tk and (b) depicts the corresponding dis- 
cretized population distribution for /3 = 5. 
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The array Y(tk) can be interpreted as a discrete approximation of the prob- 
ability density function of the outputs. To compute the <f(p^'), i = 1, . . . , M, 
the system (TTJ) is simulated for every pW g ~p an d the obtained outputs are 
discretized. 

Definition 3 A q- dimensional array Y^(t k ) G Rftx-x/ 3 * is called a discretized 
trajectory of y^> at time points tk, k = 1,...,K, with discretization vector 
p=\ft 1 ,...,p q ],jf 

(*»'--T«) v * ; [0, otherwise, (7) 
7i = 1,. .. ,f3i, .. . , 7 g = 1, .. .,/3 q . 

Hereby is the output of the system obtained by simulation with j)'*'. y" 1 , y 7 
are defined as before. 

The array Y^(tk) can also be interpreted as an approximation of the prob- 
ability density function for cells with the parameter vector pW . Given the 
discretized measured population dynamics and the discretized simulated tra- 
jectories, we intend to compute the fractions, yi(pw), such that the difference 
between the weighted sum of simulated trajectories and the measured popula- 
tion dynamics, 

M 

AY(p, t k ) = V(P {1) )Y^ (t k ) - Y(t k ), (8) 

z=l 

is zero, for k = 1, . . . , K, where 



# , ),-,»'(P lM) ) 



T 



(9) 



This problem could be solved using standard least square techniques, but this 
leads in many cases to dramatical overfitting. Especially, if the measurement 
data does not contain enough information to fit the parameter distribution, a 
spiky probability density function is obtained. Least square techniques generally 
select a minimum norm solution for underdetermined systems, so it can happen 
that parameters which are not identifiable show a peak in the probability density 
function at a single point. 

This should be avoided, because it is desirable that the resulting distribution 
indicates whether a parameter is identifiable given the measured data or not. 
Therefore, we choose an entropy based approach to determine ^(pw). 

Definition 4 The function Ent : R M — > K given by 

M 

En%) = -5>(p«)m(^(p«)). (10) 
i=l 

is called the entropy of ip. 
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Given an underdetermined system of equations, the maximum entropy ap- 
proach selects the solution which contains least infor mation, and thu s avoids 
adding artificial information to the measurement data ( MacKavl . l2003f ) . In our 



case this implies that the "flattest" probability distribution which fits all the 
constraints is selected in the optimisation problem. Thus, if a parameter is not 
identifiable, we obtain a very flat distribution and no information is added. 
The entropy approach yields the optimisation problem 

max Ent(y) 

s.t. AY(<p,t k ) = 0, k = l,...,K 
l T <p =1 
<P >0, 

where 1 = [1, . . . , 1] T G R M . The solution of (fTTj) is the weighting vector ip, 
with the highest entropy which exactly reproduces the discretized measured 
population dynamics. Unfortunately, (jlip is very likely to be infeasible because 
even if the equation 

AY(<p,t k )=0, k = l,...,K, (12) 

is underdetermined, it cannot be ensured that a solution exists. Reasons are 
measurement errors and small cell numbers in measurements, but primarily an 
insufficient discretization of the parameter space. To improve the feasibility, 
small discrepancies between the measured and the weighted simulated popula- 
tion are allowed. This leads to the relaxed problem 



max Ent(y>) 

s.t. AY(cp,t k ) e[-e,e], k = l,...,K 

l T ip =1 

f > o, 



(13) 



where AY(ip,tk) G [— e, e] denotes the constraint that each element of AY is 
bounded between — e and +e. As before, the other constraints are that all 
weights sum up to one, and that all weights are greater than or equal to zero. 

We are left with the problem to define the error bound e. A known constraint 
is that e 6 [0,1]. To obtain estimation results that fit the measurements as good 
as possible, e is decreased to the minimal value for which (|T3)) is still feasible. 
This is done via a bisection algorithm. 

The relaxed optimisation problem (| 13(1 is convex. The entropy is concave 
and the constraints are linear. For the class of convex optimisation prob- 
lems efficient solvers exist, for instance the primal-dual-interior point method 
( Bovd and Vandenberghel 2004 ) . Convergence to the global maximum in poly- 



nomial time can be guaranteed. 

Based on the solution of (fl"3"|) . an estimate <t for the parameter distribution 
function <f> is computed as 



(14) 
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3.2 Distribution estimation for independent parameters 



A simplifying yet convenient assumption is that parameter values and initial 
conditions are independently distributed. Although not strictly true in most 
cases, it is reasonable to make this simplification also if parameters are only 
weakly correlated. In this case, the probability distribution function can be 
decomposed as 

$(p)=#l(pi)* 2 (P2)"-*m(Pm), (15) 

where ^i(pi) denotes the distribution function for the i-th parameter. 

Based on the estimate $ (flU) , estimates for the individual distribution 
functions can be computed by marginalising the other parameters, i.e. taking 
&i(pi) = lim Pj _, 00! & {p)- Thus an estimate $, for the individual distributions 
is obtained as 

E ftP®) (16) 

for i — 1, . . . , m. 



4 Application to a TNF signal transduction model 
4.1 Motivation for population modelling 

TNF is a signalling hormone involved in the inflammatory response of mam- 
malian cells. It can induce programmed cell death (apoptosis) vi a the caspase 



casca de, but has also anti-apoptotic effects via the NF-kB pathway (jWaiant et al 



2003). For many cell types, stimulation with TNF will incude apoptosis in a 



certain percentage of the population, while the remaining cells stay alive. The 
reasons for this heterogeneous behaviour are unclear, but of great interest for 
biological research in TNF signalling. However, a major obstacle to the direct 
experimental study of the process is that the behaviour of individual cells cannot 
be monitored on a population scale over the time scale of interest. To overcome 
this problem, we propose the use of population modelling and estimation of 
parameter distributions from experimental population data. With a suitable 
model, a collection of single cell trajectories can be clustered according to the 
individual cell's fate and compared for characteristic differences in parameters 
or early-stage cell behaviour. 

In this paper, we use artificial measurement data, generated from simu- 
lations, for two reasons. First, suitable experimental data is not yet available. 
Second, the purpose of the paper is more an evaluation of the estimation method 
itself than its application in biological research. Also, since no general results on 
parameter identifiability in the considered problem are available, such a study 
should be done in each application of the method to evaluate identifiability 
properties. 
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Table 1: Nominal parameter values for the TNF signalling model (fl7|) . 



4.2 Presentation of the TNF signal transduction model 



The model is based on earlier work from lChaves et al.1 (|2008l ) and is built from 
known activating and inhibitory interactions among key signalling proteins. It 
includes as state variables activities of the caspases 8 and 3 (C8, C3), the tran- 
scription factor NF-kB and its inhibitor I-kB. The model is given by the ODE 
system 

±i = -X\ + - (At (x 3 )ati (u) + a 3 (x 2 )) 
x 2 = -x 2 + a 2 (xi)/3 3 {x 3 ) 

X 3 = -X 3 + /3 2 (x 2 )/3 5 {x4) 

x 4 = -x A + -(/3i(w) + a 4 (x 3 )). 

The state variables Xi, i = 1, ... ,4 are bounded between and 1 and denote 
the relative activities of the signalling proteins C8, C3, NF-kB and I-kB, re- 
spectively. The functions aj(xi), j — 1, ... ,4 represent activating connections 

and are given by ay (a;,-) = -fi -i . Correspondingly f3 3 {xi) = tttzz, J = 1, ... , 5 

represent inhibiting connections, dj and bj are parameters with values between 
and 1, representing activation and inhibition thresholds, respectively. The in- 
put u denotes the external TNF stimulus. Nominal parameter values are given 
in Table Q] 



4.3 Results of parameter distribution estimation 

To evaluate the proposed approach we consider a virtual experimental setup 
in which the caspase 3 and NF-kB activity is measured at the time points 
t £ {0, 0.5, 1, 2, 4, 6, 8, 10, 15, 20} by flow cytometric microscopy. For each time 
point, the outputs of 10000 simulated cells are obtained, resulting in an output 
density distribution for each time point. Measurement errors are neglected in 
this example to make the interpretion of the results as simple as possible. 

For this example, we assume heterogeneity in the parameters oi, an, b 2 , and 
b 3 . For the generation of measurement data, each of the heterogeneous parame- 
ters is assumed to be distributed according to a log-normal distribution around 
the nominal values given in table [TJ Each cell is assumed to have an initial con- 
dition which corresponds to the steady state with x± — x 2 — for u — 0, where 
x 3 and X4 depend on the individual parameter values. The considered hetero- 
geneity is interesting, because it results in a bimodal response of the population 
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Figure 2: Measured output distributions for caspase 3 activity at the considered 
sampling times. The horizontal axis gives the caspase 3 activity, the size of the 
bars indicates the relative frequency in the population. 



to a TNF pulse applied during the time interval < t < 2. The output distri- 
butions in caspase 3 activity that would be measured in this case are shown in 
Figured About 35 % of the population returns to zero caspase 3 activity, while 
the remaining cells show sustained caspase activity. The probability density 
functions of the parameter values are shown in Figure [3] 

For the identification of the parameter distribution, the lower and upper 
bounds of all parameters are set to be respectively 1. The sampling density d of 
the Latin hypercube is set to 2000 and (3 = [10, 10] T is selected as discretization 
vector for the outputs. The probability density functions that are estimated for 
the different parameters by our method are depicted in figure[31 in comparison to 
the real density functions. As can be seen in the figure, the probability density 
functions of CI4, 62, and 63 are approximated very well. Also for a\ we can see 
good agreement. The distributions peak at approximately the same point and 
the shape is roughly the same. 

Although all parameters are identifiable, there are huge differences in the 
idcntifiability of the single parameters. The cumulative probability functions 
of 04 and &3 can be estimated very well with a sampling density d of only 250 
(results not shown). For the approximation of 62 and especially a± more samples 
in the parameter space have to be taken. This can be related to the observation 
from the analysis done in Section 14.41 that 04 and 63 are of high relevance for 
the bimodal response of the cells, while the other two parameters do not have 
a high influence on this property. 

For the considered example, the computation is quite efficient. Computation 
time on a standard desktop computer was on the order of a few minutes, and 
most of the time was spent computing trajectories for individual parameter 
values. In the proposed algorithm, this task can easily be parallelised for more 
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Figure 3: Comparison of actual (dashed) and estimated (full) parameter prob- 
ability function. 

complex models. 

4.4 Analysis of the population model 

In this section, we discuss the biological conclusions that can be drawn from 
a computational analysis of the population model (|17p with parameter distri- 
butions as used in Section 14.31 For the considered system, it is of particular 
interest to distinguish between cells that undergo apoptosis and cells that stay 
alive. In apoptotic cells, the state variable X2 (caspase 3 activity) tends to a 
positive value larger than a threshold 9. For non-apoptotic cells, X2 returns to 
zero after a small transient rise. 

In order to investigate the underlying differences that lead to such a dif- 
ferential behaviour, we consider a sample of parameter values taken from 
the distributions specified in Section 14.31 giving rise to trajectories xW (t) of 
(TT7|) . The parameter samples are clustered into the apoptotic set A and the 
non-apoptotic set C by the criterium 

A = {p (l) | xf (T end ) >9}, £ = {p« | pW g A}j (18) 

where 9 = 0.3 for this study. 

First let us compare the sets A and C by directly examining the respective 
parameter values. As seen from Figured! the differences between the cells can 
mainly be explained from differences in the value of the parameter 63, which is 
the threshold for NF-kB to inhibit caspase 3 activation, and the parameter 04, 
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Figure 4: Comparison of parameter values for apoptotic (x) and non-apoptotic 
(o) cells and approximate separation <| 1 9|) in the 04-63 plane. 




Figure 5: Comparison of a few trajectories for apoptotic (dashed) and non- 
apoptotic (full lines) cells. 



which is the threshold for NF-kB to activate I-kB. In fact, an approximative 
separation criterium can be obtained directly from Figure [4] as 

p £ b 3 > 0.16 + 0.21a4. (19) 

Apoptotic cells are thus characterised by high values for 63 and low values for 
04, which relates well to biological intuition. The parameter 62, which is the 
threshold for caspase 3 to inhibit NF-kB activation, seems to have little to no 
influence on the cell fate. These results indicate that the cell fate is determined 
by influences from the NF-kB pathway to the caspase cascade, and not vice 
versa. 

Next, we try to find early-stage markers for the cell's fate. This is of interest 
because individual parameter values are not known when observing a single cell 
by e.g. live cell imaging, yet we may want to predict the fate of a specific indi- 
vidual cell. The collection of trajectories for the considered parameter sample 
is shown in Figure O Obviously, early-stage caspase activity is a good indicator 
for the later fate of the cell. However, it is quite interesting from the biological 
viewpoint that early-stage NF-kB activity seems not to be a good indicator. In 
fact, the NF-kB trajectories in Figure separate only for t > 10, a time for 
which most apoptotic cells show already high caspase activity. 
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5 Summary and Conclusions 



Heterogeneity in cell populations is an important aspect for research in systems 
biology. However, computational approaches to deal with heterogeneous popu- 
lations are rare. A reasonable way to describe heterogeneity is to assume that 
parameter values are stochastically distributed within the population. 

In the modelling process, it is then necessary to estimate the parameter 
distribution functions from suitable experimental data. For this paper, we as- 
sume that the output distribution in the cell population is measured at discrete 
sampling times. We present an optimisation-based approach to estimate pa- 
rameter distributions from such measurements, which minimizes the prediction 
error based on a suitable sampling of the parameter space. With the sug- 
gested latin hypercube sampling, the approach scales well to systems with a 
high-dimensional parameter space. 

We applied the suggested estimation method to artificial data for a model of 
TNF signal transduction. For the parameters where heterogeneity was assumed, 
our method gives good estimates of the parameter distribution function. The re- 
sults thus indicate that those parameters are identifiable from the measurements 
used in this setup. 
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